Large-Scale Sparse Kernel Logistic Regression — with a comparative study on optimization algorithms

Authors

  • Zhiwei Qin
  • Bo Huang
  • Shyam S. Chandramouli
  • Junfeng He
  • Sanjiv Kumar
Abstract

Kernel Logistic Regression (KLR) is a powerful probabilistic classification tool, but both its training and testing suffer from severe computational bottlenecks when used with large-scale data. Traditionally, an l1 penalty is used to induce sparsity in the parameter space for fast testing. However, most existing optimization methods for training l1-penalized KLR do not scale well to large-scale settings. In this work, we present highly scalable training of the KLR model via three first-order optimization methods: the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), Coordinate Gradient Descent (CGD), and a variant of Stochastic Gradient Descent (SGD). To further reduce the space and time complexity, we apply a simple kernel linearization technique which achieves similar results at a fraction of the computational cost. While SGD appears to be the fastest at training on large-scale data, we show that CGD performs considerably better in some cases on various quality measures. Based on this observation, we propose a multi-scale extension of FISTA which improves its computational performance significantly in practice while preserving the theoretical global convergence rate. We further propose a two-stage active-set training scheme for CGD and FISTA, which boosts prediction accuracies by up to 4%. Extensive experiments on several data sets containing up to millions of samples demonstrate the effectiveness of our approach.
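
As a rough illustration of the kind of solver the abstract describes, the following is a minimal sketch (not the authors' implementation) of FISTA applied to an l1-penalized kernel logistic regression objective. The kernel matrix K, labels y in {0,1}, penalty lam, and the Lipschitz estimate are illustrative assumptions.

    import numpy as np

    def soft_threshold(z, t):
        # Proximal operator of t * ||.||_1 (element-wise soft-thresholding).
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def fista_klr(K, y, lam, n_iter=200):
        # Minimize (1/n) * sum_i logistic_loss(K[i] @ alpha, y[i]) + lam * ||alpha||_1.
        n = K.shape[0]
        alpha = np.zeros(n)                              # kernel expansion coefficients
        z, t = alpha.copy(), 1.0                         # extrapolation point and momentum scalar
        L = np.linalg.norm(K, 2) ** 2 / (4.0 * n)        # Lipschitz bound for the smooth part
        for _ in range(n_iter):
            p = 1.0 / (1.0 + np.exp(-K @ z))             # predicted probabilities at z
            grad = K.T @ (p - y) / n                     # gradient of the average logistic loss
            alpha_next = soft_threshold(z - grad / L, lam / L)   # proximal gradient step
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            z = alpha_next + ((t - 1.0) / t_next) * (alpha_next - alpha)  # Nesterov extrapolation
            alpha, t = alpha_next, t_next
        return alpha

The multi-scale and active-set extensions mentioned above would wrap additional logic around this basic iteration; they are not shown here.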


Similar articles

Efficient Online Learning for Large-Scale Sparse Kernel Logistic Regression

In this paper, we study the problem of large-scale Kernel Logistic Regression (KLR). A straightforward approach is to apply stochastic approximation to KLR. We refer to this approach as a non-conservative online learning algorithm because it updates the kernel classifier after every received training example, leading to a dense classifier. To improve the sparsity of the KLR classifier, we propose...
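
A hedged sketch of the non-conservative update described above, in which each incoming example receives its own coefficient and the kernel expansion therefore grows dense. The kernel function, learning rate eta, and label convention y in {-1, +1} are assumptions for illustration, not details from the paper.

    import numpy as np

    def online_klr_nonconservative(X, y, kernel, eta=0.1):
        # One functional-gradient step per incoming example: every example is
        # added to the expansion, so the resulting classifier is dense.
        support, coef = [], []
        for x_t, y_t in zip(X, y):                       # y_t in {-1, +1}
            f_t = sum(a * kernel(s, x_t) for s, a in zip(support, coef))
            grad = -y_t / (1.0 + np.exp(y_t * f_t))      # d/df of log(1 + exp(-y * f))
            support.append(x_t)
            coef.append(-eta * grad)                     # new coefficient for this example
        return support, coef

    # Placeholder RBF kernel for illustration:
    rbf = lambda a, b: np.exp(-np.linalg.norm(a - b) ** 2)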


Kernel Logistic Regression Algorithm for Large-Scale Data Classification

Kernel Logistic Regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is rarely used in large-scale data classification problems, mainly because it is computationally expensive. In this paper, we present a new KLR algorithm based on Truncated Regularized Iteratively Reweighted Least Squares (TR-IRLS)...
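
The abstract names TR-IRLS but does not show the iteration; the sketch below is a generic illustration (under assumed notation, not the paper's algorithm) of regularized IRLS for kernel logistic regression in which each Newton system is solved only approximately by a few conjugate-gradient steps, which is the "truncated" part.

    import numpy as np
    from scipy.sparse.linalg import cg

    def tr_irls_klr(K, y, lam=1.0, outer=10, cg_iters=30):
        # Regularized IRLS: each outer step solves a weighted least-squares
        # (Newton) system, truncated to cg_iters conjugate-gradient iterations.
        n = K.shape[0]
        alpha = np.zeros(n)
        for _ in range(outer):
            p = 1.0 / (1.0 + np.exp(-K @ alpha))       # probabilities, labels y assumed in {0,1}
            w = p * (1.0 - p)                          # IRLS weights
            grad = K.T @ (p - y) + lam * alpha         # gradient of loss plus ridge regularizer
            H = K.T @ (w[:, None] * K) + lam * np.eye(n)
            delta, _ = cg(H, grad, maxiter=cg_iters)   # truncated (early-stopped) CG solve
            alpha = alpha - delta
        return alpha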


A Fast Hybrid Algorithm for Large-Scale l1-Regularized Logistic Regression

l1-regularized logistic regression, also known as sparse logistic regression, is widely used in machine learning, computer vision, data mining, bioinformatics and neural signal processing. The use of l1 regularization confers attractive properties on the classifier, such as feature selection, robustness to noise, and, as a result, classifier generality in the context of supervised learning. W...


Fast Implementation of l1-Regularized Learning Algorithms Using Gradient Descent Methods

With the advent of high-throughput technologies, l1 regularized learning algorithms have attracted much attention recently. Dozens of algorithms have been proposed for fast implementation, using various advanced optimization techniques. In this paper, we demonstrate that l1 regularized learning problems can be easily solved by using gradient-descent techniques. The basic idea is to transform a ...



Journal:

Volume   Issue

Pages   -

Publication year: 2011